Abstract

Maximum parsimony is one of the most frequently-discussed tree reconstruction
methods in phylogenetic estimation. However, in recent years it has become
more and more apparent that phylogenetic trees are often not sufficient
to describe evolution accurately. For instance, processes like hybridization or
lateral gene transfer that are commonplace in many groups of organisms and
result in mosaic patterns of relationships cannot be represented by a single
phylogenetic tree. This is why phylogenetic networks, which can display such
events, are becoming of more and more interest in phylogenetic research. It
is therefore necessary to extend concepts like maximum parsimony from
phylogenetic trees to networks. Several suggestions for possible extensions can
be found in recent literature, for instance the softwired and the hardwired
parsimony concepts. In this paper, we analyze the so-called big parsimony
problem under these two concepts, i.e. we investigate maximum parsimonious
networks and analyze their properties. In particular, we show that
finding a softwired maximum parsimony network is possible in polynomial
time. We also show that the set of maximum parsimony networks for the
hardwired definition always contains at least one phylogenetic tree. Lastly,
we investigate some parallels of parsimony to different likelihood concepts on
phylogenetic networks.
1. Introduction


Maximum parsimony (MP) is a popular tool to reconstruct phylogenetic
trees from a sequence of morphological or molecular characters. Since there
is currently an increasing interest in representing evolution as an intertwined
network (Bapteste et al., 2013; Morrison, 2011) that accounts for speciation
as well as reticulation events such as lateral gene transfer or hybridization, it
is not surprising that consideration is given to extending parsimony to
phylogenetic networks. Similar to parsimony on phylogenetic trees (reviewed in
Felsenstein (2004)), one distinguishes the small and big parsimony problem.
In terms of phylogenetic networks, the small parsimony problem asks for the
parsimony score of a sequence of characters on a (given) phylogenetic network,
while the big parsimony problem asks to find a phylogenetic network
for a sequence of characters that minimizes the score amongst all phylogenetic
networks. It is the latter problem that evolutionary biologists usually
want to solve for a given data set, and it is this problem that is the focus of
this paper.

Recently, two different approaches for parsimony on phylogenetic networks
have been proposed, referred to as hardwired and softwired parsimony.
The hardwired framework, introduced by Kannan and Wheeler (2012), calculates
the parsimony score of a phylogenetic network by considering character-state
transitions along every edge of the network. A slightly different approach
was taken by Nakhleh et al. (2005), who defined the softwired parsimony
score of a phylogenetic network to be the smallest (ordinary) parsimony
score of any phylogenetic tree that is displayed by the network under
consideration. Although one can compute the hardwired parsimony score of a set
of binary characters on a phylogenetic network in polynomial time (Semple
and Steel, 2003), solving the small parsimony problem is in general NP-hard
under both notions (Fischer et al., 2015; Jin et al., 2009; Nguyen et al., 2007).
In contrast, the small parsimony problem on phylogenetic trees is solvable in
polynomial time by applying Fitch-Hartigan's (Fitch, 1971; Hartigan, 1973)
or Sankoff's (Sankoff, 1975) algorithm.


Given that it is in general computationally expensive to solve the small
parsimony problem on networks, effort has been put into the development
of heuristics (Kannan and Wheeler, 2012), and algorithms that are exact
and have a reasonable running time despite the complexity of the underlying
problem (Fischer et al., 2015; Kannan and Wheeler, 2014). However, in finding
ever quicker and more advanced algorithms to solve the small parsimony
problem, an analysis of MP networks under the hardwired or softwired notion,
and their biological relevance has fallen short. The only exceptions are
two practical studies (Jin et al., 2006, 2007) that aim at the reconstruction
of a particular type of a softwired MP network for which the input does not
only consist of a sequence of characters, but also of a given phylogenetic tree
T (e.g. a species tree) and a positive integer k. More precisely, this version
of softwired parsimony adds k reticulation edges to T such that the softwired
parsimony score of the resulting phylogenetic network is minimized over all
possible solutions.

In this paper, we present the first analysis of MP networks and reveal
fundamental properties of such networks that are simultaneously surprising and
undesirable. For example, we show that an MP network under the hardwired
definition tends to have a small number of reticulations, while an MP network
under the softwired definition tends to have many reticulations. Even
stronger, we show that, for any sequence of characters, there always exists
a phylogenetic tree that is an MP network under the hardwired definition.
While some of our findings have independently been stated in Wheeler (2015),
we remark that the author does not give any formal proofs. In conclusion,
the properties we find question the biological meaningfulness of MP networks
and emphasize a fundamental difference between the hardwired and softwired
parsimony framework on phylogenetic networks. We then shift towards maximum
likelihood concepts on phylogenetic networks and analyze whether or
not the Tuffley-Steel equivalence result for phylogenetic trees also holds for
networks. It is well known that under a simple substitution model, parsimony
and likelihood on phylogenetic trees are equivalent (Tuffley and Steel, 1997).
However, as we shall show, parsimony on networks is not equivalent to one
of the most frequently-used likelihood concepts on networks. Nevertheless,
the equivalence can be recovered using functions that resemble likelihoods,
but are not true likelihoods in a probability theoretical sense. We call these
functions pseudo-likelihoods. In this sense, the equivalence of the different
parsimony concepts to pseudo-likelihoods rather than likelihoods can be
viewed as another drawback of the existing notions of parsimony.

The remainder of the paper is organized as follows. The next section
contains notation and terminology that is used throughout the paper. We
then analyze properties of MP networks under the hardwired and softwired
definition in Section 3. Additionally, this section also considers the
computational complexity of the big parsimony problem under both definitions.
Then, in a later section, we re-visit the Tuffley-Steel equivalence result for
parsimony and likelihood, and investigate to what extent it can be extended from
trees to networks. We end the paper with a brief conclusion.

Lastly, it is worth noting that our results are presented as generally as
possible. For example, we do not bound the number of character states of any
character that is considered in this paper. Furthermore, the only restriction
in the definition of a phylogenetic network (see next section for details) is
that the out-degree of a reticulation is exactly one. As a reticulation and
speciation event are unlikely to happen simultaneously, this restriction is
biologically sensible and, in fact, only needed to establish Theorem 5.

2. Preliminaries 

2.1. Trees and networks 

A rooted phylogenetic tree on $X$ is a rooted tree with no degree-two vertices
(except possibly the root which has degree at least two) and whose leaf
set is $X$. Furthermore, a rooted phylogenetic tree on $X$ is binary if each
internal vertex, except for the root, has degree three. A natural extension of
a rooted phylogenetic tree on $X$ that allows for vertices whose in-degree is
greater than one is a rooted phylogenetic network $\mathcal{N}$ on $X$ which is a rooted
acyclic digraph that satisfies the following three properties:

(i) X is the set of vertices of in-degree one and out-degree zero, 

(ii) the out-degree of the root is at least two, and 

(iii) every other vertex has either in-degree one and out-degree at least two,
or in-degree at least two and out-degree one.
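As an illustration of properties (i)-(iii), the following minimal sketch checks whether a digraph, given as a list of (parent, child) edges, is a rooted phylogenetic network in the above sense. The function name, the edge-list representation, and the example network `EDGES` are assumptions made for illustration only; the example is a hypothetical network with one reticulation and is not claimed to be the network of Figure 1.

```python
from collections import defaultdict


def is_rooted_phylogenetic_network(edges, leaves):
    """Check properties (i)-(iii) and acyclicity for a list of (parent, child) edges."""
    indeg, outdeg, children = defaultdict(int), defaultdict(int), defaultdict(list)
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
        children[u].append(v)
    vertices = {x for e in edges for x in e}
    roots = [v for v in vertices if indeg[v] == 0]
    # (ii) exactly one root, with out-degree at least two
    if len(roots) != 1 or outdeg[roots[0]] < 2:
        return False
    # (i) X is exactly the set of vertices with in-degree one and out-degree zero
    if {v for v in vertices if indeg[v] == 1 and outdeg[v] == 0} != set(leaves):
        return False
    # (iii) every other vertex is a tree vertex or a reticulation
    for v in vertices - set(leaves) - {roots[0]}:
        tree_vertex = indeg[v] == 1 and outdeg[v] >= 2
        reticulation = indeg[v] >= 2 and outdeg[v] == 1
        if not (tree_vertex or reticulation):
            return False
    # acyclicity: repeatedly remove vertices whose parents have all been removed
    pending = {v: indeg[v] for v in vertices}
    stack, removed = [roots[0]], 0
    while stack:
        u = stack.pop()
        removed += 1
        for w in children[u]:
            pending[w] -= 1
            if pending[w] == 0:
                stack.append(w)
    return removed == len(vertices)


# Hypothetical network on X = {1, 2, 3, 4}; "h" is its only reticulation.
EDGES = [("r", "a"), ("r", "b"), ("a", "1"), ("a", "c"), ("c", "2"),
         ("c", "h"), ("b", "h"), ("b", "4"), ("h", "3")]
LEAVES = {"1", "2", "3", "4"}
```

Note that deleting the reticulation edge `("c", "h")` from `EDGES` leaves vertices with in-degree one and out-degree one, so the result only becomes a rooted phylogenetic tree after those vertices are contracted.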

Similar to rooted phylogenetic trees, we call $X$ the leaf set of $\mathcal{N}$. Furthermore,
each vertex of $\mathcal{N}$ whose in-degree is at least two is called a reticulation and
represents a species whose genome is a mosaic of at least two distinct parental
genomes, while each edge directed into a reticulation is called a reticulation
edge. To illustrate, a rooted phylogenetic network on $X = \{1, 2, 3, 4\}$ and
with one reticulation is shown on the left-hand side of Figure 1. Moreover,
for two vertices $u$ and $v$ in $\mathcal{N}$, we say that $u$ is a parent of $v$ or, equivalently, $v$
is a child of $u$ if $(u, v)$ is an edge in $\mathcal{N}$. Lastly, note that a rooted phylogenetic
tree on $X$ is a rooted phylogenetic network on $X$ with no reticulation.

Figure 1: Left: A rooted phylogenetic network $\mathcal{N}$ on leaf set $X = \{1, 2, 3, 4\}$. Right: The
two rooted phylogenetic trees $\mathcal{T}_1$ and $\mathcal{T}_2$ on $X$ displayed by $\mathcal{N}$.

Let $\mathcal{N}$ be a rooted phylogenetic network on $X$ and let $\mathcal{T}$ be a rooted
phylogenetic tree on $X$. We say that $\mathcal{T}$ is displayed by $\mathcal{N}$ if, up to contracting
vertices with in-degree one and out-degree one, $\mathcal{T}$ can be obtained from $\mathcal{N}$
by deleting edges and non-root vertices, in which case the resulting acyclic
digraph is an embedding of $\mathcal{T}$ in $\mathcal{N}$. Intuitively, if $\mathcal{T}$ is displayed by $\mathcal{N}$, then
all ancestral information inferred by $\mathcal{T}$ is also inferred by $\mathcal{N}$. The two rooted
phylogenetic trees $\mathcal{T}_1$ and $\mathcal{T}_2$ that are displayed by the network shown on the
left-hand side of Figure 1 are presented on the right-hand side of the same
figure. Lastly, we use $\mathcal{D}(\mathcal{N})$ to denote the set of all rooted phylogenetic trees
that are displayed by $\mathcal{N}$.

2.2. Characters 

Let $G$ be an acyclic digraph. We denote the vertex set of $G$ by $V(G)$ and
the edge set of $G$ by $E(G)$. Furthermore, we call $X$ a distinguished set of $G$
if it is a subset of the vertices of $G$ whose out-degree is zero such that, if $G$ is
a rooted phylogenetic network $\mathcal{N}$ (resp. a rooted phylogenetic tree $\mathcal{T}$), then
$X$ is precisely the leaf set of $\mathcal{N}$ (resp. $\mathcal{T}$). A character on $X$ is a function $\chi$
from $X$ into a set $C$ of character states.

Let $G$ be an acyclic digraph with distinguished set $X$ and let $\chi$ be a
character on $X$. An extension of $\chi$ to $V(G)$ is a function $\bar{\chi}$ from $V(G)$ to $C$
such that $\bar{\chi}(\ell) = \chi(\ell)$ for each element $\ell \in X$. For an extension $\bar{\chi}$ of $\chi$ to
$V(G)$, we set

$$\mathrm{ch}(\bar{\chi}, G) = |\{(u, v) \in E(G) : \bar{\chi}(u) \neq \bar{\chi}(v)\}|$$

and refer to it as the changing number of $\bar{\chi}$. In other words, the changing
number of $\bar{\chi}$ is the number of edges in $G$ whose two endpoints are assigned
to different character states. Two characters $\chi_1$ and $\chi_2$ on $X$ are shown
on the left-hand side of Figure 2, while possible extensions $\bar{\chi}_1$ and $\bar{\chi}_2$ of $\chi_1$
and $\chi_2$, respectively, to the vertex set of the underlying rooted phylogenetic
network $\mathcal{N}$ on four leaves are shown in the middle and on the right-hand side
of the same figure. Note that $\mathrm{ch}(\bar{\chi}_1, \mathcal{N}) = \mathrm{ch}(\bar{\chi}_2, \mathcal{N}) = 2$. If $G$ is a rooted
phylogenetic tree on $X$, we say that $\chi$ is homoplasy-free on $G$ if there exists
an extension $\bar{\chi}$ of $\chi$ to $V(G)$ such that, for each character state $c_i \in C$, the
subgraph of $G$ induced by $\{v \in V(G) : \bar{\chi}(v) = c_i\}$, the subset of vertices
assigned to the same character state, is connected. Equivalently, $\chi$ is said
to be homoplasy-free on $G$ if there exists an extension $\bar{\chi}$ of $\chi$ to $V(G)$ such
that $\mathrm{ch}(\bar{\chi}, G) = |C| - 1$. Biologically speaking, if $\chi$ is homoplasy-free on
a rooted phylogenetic tree, then $\chi$ can be explained without any reverse or
convergent character-state transitions. Note that, for each character $\chi$ on $X$,
there always exists a rooted phylogenetic tree $\mathcal{T}$ such that $\chi$ is homoplasy-free
on $\mathcal{T}$, in which case $\mathcal{T}$ is said to be a perfect phylogeny for $\chi$.
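The changing number can be computed directly from its definition. The sketch below is illustrative only: the function names, the dictionary representation of characters and extensions, and the small example tree are assumptions, not part of the original text.

```python
def changing_number(edges, extension):
    """Number of edges (u, v) whose endpoints receive different character states."""
    return sum(1 for u, v in edges if extension[u] != extension[v])


def is_extension(extension, character):
    """An extension must agree with the character on every element of X."""
    return all(extension[leaf] == state for leaf, state in character.items())


# Hypothetical rooted binary tree ((1,2),(3,4)) with internal vertices r, u, w.
TREE_EDGES = [("r", "u"), ("r", "w"), ("u", "1"), ("u", "2"), ("w", "3"), ("w", "4")]
CHI = {"1": "alpha", "2": "alpha", "3": "beta", "4": "beta"}
EXT = dict(CHI, r="alpha", u="alpha", w="beta")
```

Here `changing_number(TREE_EDGES, EXT)` counts a single edge, `("r", "w")`, which equals the number of character states minus one, so this character is homoplasy-free on the example tree.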

Now, let $\mathcal{T}$ be a perfect phylogeny for a character $\chi$ on $X$. It is easily
checked that any rooted binary phylogenetic tree $\mathcal{T}'$ on $X$ with the property
that $\mathcal{T}$ can be obtained from $\mathcal{T}'$ by contracting a possibly empty set of
edges is also a perfect phylogeny for $\chi$. We call $\mathcal{T}'$ a binary refinement of
$\mathcal{T}$. The next observation is an immediate consequence of the fact that each
phylogenetic tree has a binary refinement.

Observation 1. Let $\chi$ be a character on $X$. There exists a rooted binary
phylogenetic tree $\mathcal{T}$ on $X$ that is a perfect phylogeny for $\chi$.


3. Parsimony on Networks 

In this section, we review the different notions of parsimony on networks.
In particular, we describe the hardwired and softwired notion that have been
introduced by Kannan and Wheeler (2012) and Nakhleh et al. (2005), respectively.
For the softwired notion, we describe three equivalent definitions, one
of which is new to this paper. Moreover, we analyze the big parsimony problem
on networks, and present new and curious properties of MP networks
under both notions that challenge the biological relevance of such networks.


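[Figure 2 panels. Left: the two characters on $X = \{1, 2, 3, 4\}$, with $\chi_1(1) = \chi_1(2) = \alpha$, $\chi_1(3) = \chi_1(4) = \beta$ and $\chi_2(1) = \chi_2(4) = \alpha$, $\chi_2(2) = \chi_2(3) = \beta$. Middle and right: the network with the two extensions.]

Figure 2: Left: Two characters $\chi_1$ and $\chi_2$ on $X$, each with two character states $\alpha$ and $\beta$.
Middle and right: An extension $\bar{\chi}_1$ (resp. $\bar{\chi}_2$) of $\chi_1$ (resp. $\chi_2$) to the vertex set of the
underlying rooted phylogenetic network $\mathcal{N}$ on four leaves. Indicated by the thicker edges,
note that both extensions yield two edges in $\mathcal{N}$ whose two endpoints are assigned to two
different character states.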


3.1. Hardwired Parsimony 

The hardwired parsimony score of a character $\chi$ on an acyclic digraph $G$
extends the definition of the parsimony score of $\chi$ on a rooted phylogenetic
tree in, possibly, the most natural way. Intuitively, the hardwired parsimony
score of $\chi$ on $G$ equates to the smallest number of character-state transitions
over all edges of $G$ that is required to explain $\chi$ on $G$.

Formally, let $S = (\chi_1, \chi_2, \ldots, \chi_k)$ be a sequence of characters on $X$, and
let $G$ be an acyclic digraph with a distinguished set $X$. Then, the hardwired
parsimony score of $S$ on $G$ is defined as

$$PS_{hard}(S, G) = \sum_{i=1}^{k} \min_{\bar{\chi}_i} \big(\mathrm{ch}(\bar{\chi}_i, G)\big),$$

where, for each character $\chi_i$, the minimum is taken over all extensions $\bar{\chi}_i$ of $\chi_i$
to $V(G)$. We note that the hardwired parsimony score of $S$ on $G$ coincides
with that of the (ordinary) parsimony score (Fitch, 1971) if $G$ is a (rooted)
phylogenetic tree $\mathcal{T}$ on $X$ and denote the latter score by $PS(S, \mathcal{T})$. Moreover,
since a rooted phylogenetic network $\mathcal{N}$ is a special type of acyclic digraph,
the definition of the hardwired parsimony score of $S$ on $G$ naturally carries
over to the hardwired parsimony score of $S$ on $\mathcal{N}$.
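For very small digraphs, the definition above can be evaluated by exhaustive search over all extensions. The following sketch does exactly that and is meant only to make the definition concrete; it takes time exponential in the number of non-leaf vertices. The function names and the example network are assumptions for illustration (the network is hypothetical, and the two characters follow the state pattern listed for Figure 2). The search restricts internal vertices to states that occur at the leaves, which is an assumption of this sketch that does not change the minimum.

```python
from itertools import product


def hardwired_score(edges, character):
    """Minimum changing number over all extensions of one character (brute force)."""
    vertices = {x for e in edges for x in e}
    internal = sorted(vertices - set(character))
    states = sorted(set(character.values()))
    best = None
    for choice in product(states, repeat=len(internal)):
        ext = dict(character)
        ext.update(zip(internal, choice))
        ch = sum(1 for u, v in edges if ext[u] != ext[v])
        best = ch if best is None else min(best, ch)
    return best


def hardwired_score_sequence(edges, characters):
    """Hardwired parsimony score of a sequence: sum of the per-character minima."""
    return sum(hardwired_score(edges, chi) for chi in characters)


# Hypothetical network with one reticulation "h", and the tree obtained
# from it by deleting the reticulation edge ("c", "h").
NETWORK = [("r", "a"), ("r", "b"), ("a", "1"), ("a", "c"), ("c", "2"),
           ("c", "h"), ("b", "h"), ("b", "4"), ("h", "3")]
TREE = [e for e in NETWORK if e != ("c", "h")]
CHI1 = {"1": "alpha", "2": "alpha", "3": "beta", "4": "beta"}
CHI2 = {"1": "alpha", "2": "beta", "3": "beta", "4": "alpha"}
```

On this example, the first character needs two changes on the network but only one after the reticulation edge is deleted, which illustrates why removing edges cannot hurt the hardwired score.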

In practice, we are usually not given a rooted phylogenetic network. We
are simply given a sequence $S$ of characters on $X$ and the aim is to find a
rooted phylogenetic network $\mathcal{N}$ on $X$ that has the smallest hardwired parsimony
score for $S$ among all such networks, i.e. $PS_{hard}(S, \mathcal{N}) \leq PS_{hard}(S, \mathcal{N}')$
for each rooted phylogenetic network $\mathcal{N}'$ on $X$. We refer to $\mathcal{N}$ as a hardwired
MP network and denote the corresponding parsimony score by $PS_{hard}(S)$.
For example, Figure 2 shows a rooted phylogenetic network $\mathcal{N}$ whose hardwired
parsimony score is $PS_{hard}((\chi_1, \chi_2), \mathcal{N}) = 4$, where $\chi_1$ and $\chi_2$ are the
two characters shown on the left-hand side of the same figure.

Figure 3: An example to illustrate Observation 2, where $G_2$ is obtained from $G_1$ by a
sequence of edge (indicated by the thicker edges in $G_1$) and vertex deletions, and $G_3$ is
obtained from $G_2$ by contracting all vertices with in-degree one and out-degree one. Note
that $PS_{hard}(\chi_1, G_1) = 2$ and $PS_{hard}(\chi_1, G_2) = PS_{hard}(\chi_1, G_3) = 1$, where $\chi_1$ is the
character shown on the left-hand side of Figure 2.

The first main result, Theorem 1, describes the first of our curious properties
for MP networks. Let $S$ be a sequence of characters on $X$, and let
$G$ and $G'$ be two acyclic digraphs with distinguished set $X$. If $G'$ can be
obtained from $G$ by deleting an edge, deleting a vertex, or contracting a vertex
with in-degree one and out-degree one, then it is easily checked that the
hardwired parsimony score for $S$ on $G'$ is at most the hardwired parsimony
score for $S$ on $G$. We summarize this result in the following observation, for
which an example is shown in Figure 3.

Observation 2. For an acyclic digraph G with distinguished set X, deleting 
an edge or vertex in G that is not in X without disconnecting G, or contract¬ 
ing a vertex of G with in-degree one and out-degree one never increases the 
hardwired parsimony score. 

The next theorem follows by taking Observation 2 to an extreme for a
rooted phylogenetic network $\mathcal{N}$ on $X$, i.e. deleting edges and vertices, and
contracting vertices with in-degree one and out-degree one in $\mathcal{N}$ until the
resulting graph is a rooted phylogenetic tree on $X$.

Theorem 1. Let $S$ be a sequence of characters on $X$. There is always a
rooted phylogenetic tree on $X$ that is a hardwired MP network for $S$.

Theorem 1 immediately follows from the next lemma which is also used in
the proof of Theorem 2.

Lemma 1. Let $\mathcal{N}$ be a rooted phylogenetic network on $X$, and let $\mathcal{T}$ be a
rooted phylogenetic tree on $X$ that is displayed by $\mathcal{N}$. Furthermore, let $\chi$ be a
character on $X$, and let $\bar{\chi}$ be an extension of $\chi$ to $V(\mathcal{N})$. Then, there exists
an extension $\bar{\chi}_1$ of $\chi$ to $V(\mathcal{T})$ such that $\mathrm{ch}(\bar{\chi}, \mathcal{N}) \geq \mathrm{ch}(\bar{\chi}_1, \mathcal{T})$.

Proof. By the definition of displaying, $\mathcal{T}$ can be obtained from $\mathcal{N}$ by first
deleting edges and vertices to get a tree $T$, and then contracting any resulting
vertices with in-degree one and out-degree one. By construction,

$$V(\mathcal{T}) \subseteq V(T) \subseteq V(\mathcal{N}).$$

Let $f$ be the identity function from $V(\mathcal{T})$ to $V(T)$, and let $g$ be the identity
function from $V(T)$ to $V(\mathcal{N})$. Now, let $\bar{\chi}_1$ be the extension of $\chi$ to $V(\mathcal{T})$
such that $\bar{\chi}_1(v) = \bar{\chi}(g(f(v)))$ for each vertex $v$ in $V(\mathcal{T})$. Let $e = (u, v)$ be
an edge of $\mathcal{T}$. Note that $e$ corresponds to a path in $T$ from $f(u)$ to $f(v)$.
If $\bar{\chi}_1(u) \neq \bar{\chi}_1(v)$, then $e$ contributes one to $\mathrm{ch}(\bar{\chi}_1, \mathcal{T})$. Moreover, the edges
on the path $f(u) = w_1, w_2, \ldots, w_n = f(v)$ in $T$, and therefore the edges
on the path $g(f(u)) = g(w_1), g(w_2), \ldots, g(w_n) = g(f(v))$ in $\mathcal{N}$, collectively
contribute at least one to $\mathrm{ch}(\bar{\chi}, \mathcal{N})$. Summing over all edges in $\mathcal{T}$, we deduce
that $\mathrm{ch}(\bar{\chi}, \mathcal{N}) \geq \mathrm{ch}(\bar{\chi}_1, \mathcal{T})$. □

Following on from Theorem 1, it can be shown that each hardwired MP
network $\mathcal{N}$ for $S = (\chi_1, \chi_2, \ldots, \chi_k)$ that is not a phylogenetic tree has the
following property. Let $v$ be a reticulation in $\mathcal{N}$ and, for each $i \in \{1, 2, \ldots, k\}$,
let $\bar{\chi}_i$ be an extension of $\chi_i$ such that $\bar{\chi}_1, \bar{\chi}_2, \ldots, \bar{\chi}_k$ collectively realize
$PS_{hard}(S)$. Then $v$ and all its parents are assigned to the same character
state. To justify this comment, assume that there exists a character in $S$ for
which $v$ and a parent, say $p$, of $v$ are assigned to two different character states.
Then deleting the edge $(p, v)$ in $\mathcal{N}$ decreases the hardwired parsimony score.
Now, by subsequently deleting edges and vertices, and contracting vertices,
we can always obtain a rooted phylogenetic tree on $X$ from $\mathcal{N}$ which, by
Observation 2, has the property that its hardwired parsimony score is strictly
less than that of $\mathcal{N}$. This contradicts the assumption that $\mathcal{N}$ is a hardwired
MP network for $S$. Hence, from a biological point of view, it seems to be
sensible to argue that $v$ and all of its parental species have the same genetic
makeup; thereby indicating that the associated reticulation event is possibly
redundant.

Referring back to Theorem 1, it is not too difficult to see that each rooted
phylogenetic tree that is a hardwired MP network $\mathcal{N}$ for $S$ is, in fact, an
MP tree for $S$ since, otherwise, $\mathcal{N}$ is not optimal. Moreover, by slightly
strengthening this fact, the next theorem uncovers an interesting property of
all rooted phylogenetic trees that are displayed by a hardwired MP network
for $S$.


Theorem 2. Let $S$ be a sequence of characters on $X$, and let $\mathcal{N}$ be a hardwired
MP network for $S$. Each rooted phylogenetic tree on $X$ that is displayed
by $\mathcal{N}$ is an MP tree for $S$.

Proof. Let $\mathcal{T}$ be a rooted phylogenetic tree on $X$ that is displayed by $\mathcal{N}$. By
the optimality of $\mathcal{N}$, we have $PS_{hard}(S, \mathcal{N}) \leq PS_{hard}(S, \mathcal{T})$. Furthermore,
applying Lemma 1 to each character in $S$, we also have $PS_{hard}(S, \mathcal{N}) \geq
PS_{hard}(S, \mathcal{T})$. Hence,

$$PS_{hard}(S) = PS_{hard}(S, \mathcal{N}) = PS_{hard}(S, \mathcal{T}),$$

and so $\mathcal{T}$ is a hardwired MP network for $S$. Moreover, since the (ordinary)
parsimony definition for rooted phylogenetic trees coincides with the hardwired
definition when restricted to rooted phylogenetic trees, it follows that
$\mathcal{T}$ is an MP tree for $S$. This completes the proof of the theorem. □


We end this section with a result on the computational complexity of the
big parsimony problem on phylogenetic networks under the hardwired definition.
Similar to the big parsimony problem on phylogenetic trees (Foulds
and Graham, 1982), the next corollary states that it is NP-hard
to compute a hardwired MP network for a sequence of characters.


Corollary 1. Let S be a sequence of characters on X. Finding a hardwired 
MP network for S is NP-hard. 


Proof. To prove that the result holds, assume the contrary. Then it takes
time polynomial in the size of $X$ and $S$ to calculate a hardwired MP network
$\mathcal{N}$ for $S$. Let $\mathcal{T}$ be any rooted phylogenetic tree on $X$ that is displayed by $\mathcal{N}$.
By Theorem 2, $\mathcal{T}$ is an MP tree for $S$. Since such a tree can be constructed
from $\mathcal{N}$ in polynomial time, this contradicts the fact that calculating a maximum
parsimony tree for $S$ is NP-hard (Foulds and Graham, 1982). Hence,
calculating a hardwired MP network for $S$ is NP-hard. □

On the positive side, it is however worth noting that in order to find a
hardwired MP network for a sequence of characters on $X$, by Theorem 1, it
suffices to search through all rooted phylogenetic trees on $X$ instead of the
greatly enlarged space of all rooted phylogenetic networks on $X$ (McDiarmid
et al., 2015).


3.2. Softwired Parsimony 

While the evolution of a set of species whose past is likely to include
reticulation events can often be best represented by a phylogenetic network,
the evolution of a particular gene or DNA segment can generally be described
without reticulation events and therefore be represented by a phylogenetic
tree. Hence, it seems plausible to assume that the evolution of a character,
which is often associated with a gene or a single nucleotide, can also be
represented by a tree. Using this idea, the softwired parsimony score of a
character $\chi$ on a rooted phylogenetic network $\mathcal{N}$ is defined to be the smallest
number of character-state transitions that is necessary to explain $\chi$ on any
tree that is displayed by $\mathcal{N}$.

Formally, we have the following definition. Let $S = (\chi_1, \chi_2, \ldots, \chi_k)$ be a
sequence of characters on $X$, and let $\mathcal{N}$ be a rooted phylogenetic network on
$X$. Then, the softwired parsimony score of $S$ on $\mathcal{N}$ is defined as

$$PS_{soft}(S, \mathcal{N}) = \sum_{i=1}^{k} \min_{\mathcal{T} \in \mathcal{D}(\mathcal{N})} \min_{\bar{\chi}_i} \big(\mathrm{ch}(\bar{\chi}_i, \mathcal{T})\big) = \sum_{i=1}^{k} \min_{\mathcal{T} \in \mathcal{D}(\mathcal{N})} PS(\chi_i, \mathcal{T}),$$

where, for each character $\chi_i$, the first minimum is taken over all rooted
phylogenetic trees $\mathcal{T}$ on $X$ displayed by $\mathcal{N}$ and the second minimum is taken
over all extensions $\bar{\chi}_i$ of $\chi_i$ to $V(\mathcal{T})$. Similar to the previous section, it is
worth noting that, if $\mathcal{N}$ is a rooted phylogenetic tree, then the (ordinary)
parsimony score (Fitch, 1971) of $S$ on $\mathcal{N}$ is equal to the softwired parsimony
score of $S$ on $\mathcal{N}$. Lastly, we refer to $\mathcal{N}$ as a softwired MP network and
denote the corresponding parsimony score by $PS_{soft}(S)$ if $\mathcal{N}$ has the smallest
softwired parsimony score for $S$ among all such networks, i.e. $PS_{soft}(S, \mathcal{N}) \leq
PS_{soft}(S, \mathcal{N}')$ for each rooted phylogenetic network $\mathcal{N}'$ on $X$. To illustrate,
Figure 4 shows a rooted phylogenetic network $\mathcal{N}'$ with $PS_{soft}(\chi_2, \mathcal{N}') = 1$,
where $\chi_2$ is the character that is shown on the left-hand side of Figure 2.
Indeed, it is easily checked that $\mathcal{N}'$ is a softwired MP network for $\chi_2$.

We next describe our second curious property for MP networks. Let $\mathcal{N}$
and $\mathcal{N}'$ be two rooted phylogenetic networks on $X$ such that $\mathcal{N}'$ can be
obtained from $\mathcal{N}$ by subdividing two edges and adding a new edge joining
the two new vertices. Since the collection of rooted phylogenetic trees displayed
by $\mathcal{N}$ is a subset of the collection of rooted phylogenetic trees that
are displayed by $\mathcal{N}'$, the next observation, which is in stark contrast to
Observation 2, is an immediate consequence of the definition of the softwired
parsimony score.

Figure 4: A rooted phylogenetic network $\mathcal{N}'$ on $X = \{1, 2, 3, 4\}$ that displays the three
rooted phylogenetic trees $\mathcal{T}_1$, $\mathcal{T}_2$, and $\mathcal{T}_3$ on $X$ that are shown on the right-hand side. With
$\chi_2$ being the character shown on the left-hand side of Figure 2, we see that $\chi_2$ can be
explained on $\mathcal{T}_3$ with just one character-state transition while two character-state transitions
are necessary to explain $\chi_2$ on each of $\mathcal{T}_1$ and $\mathcal{T}_2$. Hence, we have $PS_{soft}(\chi_2, \mathcal{N}') = 1$.
Moreover, to illustrate Observation 3, note that the rooted phylogenetic network $\mathcal{N}'$ can
be obtained from the network $\mathcal{N}$ that is shown in Figure 1 by adding an edge that joins
two new non-leaf vertices (indicated by the thicker edge). Since $\mathcal{N}$ does not display $\mathcal{T}_3$,
we have $PS_{soft}(\chi_2, \mathcal{N}) > PS_{soft}(\chi_2, \mathcal{N}')$.


Observation 3. Adding an edge joining two new non-leaf vertices to a rooted 
phylogenetic network never increases the softwired parsimony score. 


This observation was first mentioned by Nakhleh et al. (2005), who noticed
that networks with a large number of reticulations tend to have a smaller
parsimony score. An example to illustrate Observation 3 is shown in Figure 4.
Perhaps surprisingly, in comparison to what happens under the hardwired
notion, the next theorem states that solving the big parsimony problem on
networks under the softwired definition for a sequence $S$ of characters on
$X$ is not NP-hard. Intuitively, this can be justified by noting that it takes
polynomial time to construct a rooted phylogenetic network $\mathcal{N}$ that displays
a perfect phylogeny on $X$ for each character in $S$. It then follows that $\mathcal{N}$ is
a softwired MP network for $S$.

In the proof of the next theorem, we make use of a construction in Francis
and Steel (2015). In particular, the authors describe a construction of a
rooted phylogenetic network $\mathcal{N}$ on $X$ that displays all rooted binary phylogenetic
trees on $X$. Additionally, they have shown that $\mathcal{N}$ can be constructed
from a rooted binary phylogenetic tree $\mathcal{T}$ on $X$ by adding a number of edges to
$\mathcal{T}$ that is quadratic in $n$, where each such edge joins two vertices that subdivide
edges in $\mathcal{T}$ and $n = |X|$. Note that, as $\mathcal{T}$ has $O(n)$ edges, $\mathcal{N}$ has $O(n^2)$ edges.
In what follows, we call a rooted binary phylogenetic network on $X$ a universal network
on $X$ if it displays all rooted binary phylogenetic trees on $X$.

Theorem 3. Let S be a sequence of characters on X. Finding a softwired 
MP network for S is solvable in time polynomial in the size of X. 

Proof. Let $n = |X|$, and suppose that $S = (\chi_1, \chi_2, \ldots, \chi_k)$ is a sequence
of characters on $X$. Let $\mathcal{N}$ be a universal network on $X$ whose number of
edges is polynomial in $n$. By the paragraph prior to this theorem, such a
network exists (Francis and Steel, 2015). Hence, $\mathcal{N}$ can be constructed in
time polynomial in $n$.

We complete the proof by showing that $\mathcal{N}$ is a softwired MP network for
$S$. For each $i \in \{1, 2, \ldots, k\}$, let $r_i$ denote the number of distinct character
states of $\chi_i$, and let $T_i$ be the unique rooted tree with exactly $r_i + 1$ internal
vertices that has the following properties. The set of vertices of $T_i$ with
out-degree zero is precisely $X$, the root of $T_i$ is adjacent to $r_i$ internal vertices,
and two elements in $X$, say $\ell$ and $\ell'$, are adjacent to the same internal vertex
of $T_i$ if and only if $\chi_i(\ell) = \chi_i(\ell')$. Now, obtain a rooted phylogenetic tree
$\mathcal{T}_i$ on $X$ from $T_i$ by contracting each vertex with in-degree one and
out-degree one. Let $\mathcal{T}_i'$ be any binary refinement of $\mathcal{T}_i$. It is easily checked that
$PS(\chi_i, \mathcal{T}_i) = PS(\chi_i, \mathcal{T}_i') = r_i - 1$. In particular, $\chi_i$ is homoplasy-free on $\mathcal{T}_i'$
and, hence, by Proposition 5.1.3 of Semple and Steel (2003), $\mathcal{T}_i'$ is an MP
tree for $\chi_i$. Furthermore, by construction, $\mathcal{N}$ displays $\mathcal{T}_i'$. It now follows that

$$PS_{soft}(S, \mathcal{N}) = \sum_{i=1}^{k} (r_i - 1) = PS_{soft}(S).$$

This completes the proof of the theorem. □

From a practical viewpoint, the construction in the proof of Theorem 3
implies that one can construct a softwired MP network for an arbitrary sequence
$S$ of characters on $X$ without looking at the data by simply constructing
a universal network on $X$.
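The tree used in the proof of Theorem 3, in which leaves are grouped under one internal vertex per character state, can be sketched as follows. The sketch builds the tree before contraction together with a natural extension and confirms that its changing number is the number of states minus one. All names (`state_grouping_tree`, the `"root"` and `("hub", state)` vertex labels) are assumptions for illustration, and leaf labels are assumed not to clash with them.

```python
def state_grouping_tree(character):
    """Build the tree with one internal 'hub' vertex per character state.

    Returns (edges, extension), where extension assigns each hub the state
    of its leaves and assigns the root the state of an arbitrary hub.
    """
    groups = {}
    for leaf, state in character.items():
        groups.setdefault(state, []).append(leaf)
    extension = dict(character)
    extension["root"] = next(iter(groups))  # any observed state works
    edges = []
    for state, leaves in groups.items():
        hub = ("hub", state)
        extension[hub] = state
        edges.append(("root", hub))
        edges.extend((hub, leaf) for leaf in leaves)
    return edges, extension


def changing_number(edges, extension):
    """Number of edges whose endpoints are assigned different states."""
    return sum(1 for u, v in edges if extension[u] != extension[v])


# Hypothetical character with three states on five leaves.
CHI = {"1": "A", "2": "A", "3": "C", "4": "G", "5": "C"}
GROUP_EDGES, GROUP_EXT = state_grouping_tree(CHI)
```

Only the edges from the root to hubs of a different state change, so the changing number here is two for three states. Hubs with a single leaf have in-degree one and out-degree one and would be contracted to obtain a rooted phylogenetic tree, which does not alter this count.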














We end this subsection with two equivalent ways of viewing the softwired
notion of the parsimony score of a sequence of characters on a phylogenetic
network. The first is due to Fischer et al. (2015). Let $S$ be a sequence
of characters on $X$. Furthermore, let $\mathcal{N}$ be a rooted phylogenetic network
on $X$, and let $T$ be a rooted tree with a distinguished set $X$. Note that
$T$ is not necessarily a phylogenetic tree. Then $T$ is called a switching of $\mathcal{N}$
if it can be obtained from $\mathcal{N}$ by deleting, for each reticulation $v$, all but
one edge directed into $v$. It is easily checked that each rooted phylogenetic
tree on $X$ that is displayed by $\mathcal{N}$ can be obtained from a switching of $\mathcal{N}$
by repeated applications of the following two operations: deleting unlabeled
vertices of degree one, and contracting vertices with in-degree one and
out-degree one. Conversely, each switching of $\mathcal{N}$ can be transformed into a rooted
phylogenetic tree on $X$ that is displayed by $\mathcal{N}$ by repeated applications of
the same two operations.

Now, let $\mathcal{S}(\mathcal{N})$ denote the set of all switchings of a rooted phylogenetic
network $\mathcal{N}$. The next theorem allows us to work with $\mathcal{S}(\mathcal{N})$ instead of the
set of all trees that are displayed by $\mathcal{N}$ to compute $PS_{soft}(S, \mathcal{N})$.

Theorem 4 (Lemma 4.5 of Fischer et al. (2015)). Let $S = (\chi_1, \chi_2, \ldots, \chi_k)$
be a sequence of characters on $X$, and let $\mathcal{N}$ be a rooted phylogenetic network
on $X$. Then,

$$PS_{soft}(S, \mathcal{N}) = \sum_{i=1}^{k} \min_{T} \min_{\bar{\chi}_i} \big(\mathrm{ch}(\bar{\chi}_i, T)\big),$$

where, for each character $\chi_i$, the first minimum is taken over all switchings
$T$ of $\mathcal{N}$ and the second minimum is taken over all extensions $\bar{\chi}_i$ of $\chi_i$ to $V(T)$.
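Theorem 4 suggests a direct, if exponential, way to evaluate the softwired score on tiny examples: enumerate all switchings, and for each one minimize the changing number over all extensions. The sketch below does this by brute force and is not an efficient algorithm. The function names and the example network are assumptions for illustration; the network is hypothetical, and internal vertices are restricted to states observed at the leaves, which does not change the minimum.

```python
from itertools import product


def switchings(edges):
    """Yield every switching: keep exactly one incoming edge per reticulation."""
    parents = {}
    for u, v in edges:
        parents.setdefault(v, []).append(u)
    retics = sorted(v for v, ps in parents.items() if len(ps) >= 2)
    fixed = [(u, v) for u, v in edges if len(parents[v]) < 2]
    for choice in product(*(parents[v] for v in retics)):
        yield fixed + [(p, v) for p, v in zip(choice, retics)]


def min_changes(edges, vertices, character):
    """Minimum changing number over all extensions of one character (brute force)."""
    internal = sorted(vertices - set(character))
    states = sorted(set(character.values()))
    best = None
    for choice in product(states, repeat=len(internal)):
        ext = dict(character)
        ext.update(zip(internal, choice))
        ch = sum(1 for u, v in edges if ext[u] != ext[v])
        best = ch if best is None else min(best, ch)
    return best


def softwired_score(edges, characters):
    """Sum over characters of the best score achieved on any switching."""
    vertices = {x for e in edges for x in e}
    return sum(min(min_changes(t, vertices, chi) for t in switchings(edges))
               for chi in characters)


# Hypothetical network with one reticulation "h" and two binary characters.
NETWORK = [("r", "a"), ("r", "b"), ("a", "1"), ("a", "c"), ("c", "2"),
           ("c", "h"), ("b", "h"), ("b", "4"), ("h", "3")]
CHI1 = {"1": "alpha", "2": "alpha", "3": "beta", "4": "beta"}
CHI2 = {"1": "alpha", "2": "beta", "3": "beta", "4": "alpha"}
```

On this example each character attains score one, but on different switchings, which is exactly the freedom that the softwired definition grants and the hardwired definition does not.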

The second equivalence is new to this paper and requires a new definition.
Let $\chi$ be a character on $X$, and let $\mathcal{N}$ be a rooted phylogenetic network on
$X$. Furthermore, let $\bar{\chi}$ be an extension of $\chi$ to $V(\mathcal{N})$. For a reticulation edge
$(u, v)$ of $\mathcal{N}$, we say that $(u, v)$ is a negligible edge under $\bar{\chi}$ if $\bar{\chi}(u) \neq \bar{\chi}(v)$ but
there exists a parent $p$ of $v$ in $\mathcal{N}$ such that $\bar{\chi}(p) = \bar{\chi}(v)$. We use $n(\bar{\chi}, \mathcal{N})$ to
denote the number of negligible edges under $\bar{\chi}$ in $\mathcal{N}$. For example, in Figure 2,
the extension $\bar{\chi}_1$ of $\chi_1$ to the vertex set of the rooted phylogenetic network
$\mathcal{N}$ shown in the middle of the same figure has $n(\bar{\chi}_1, \mathcal{N}) = 1$. The next
theorem shows how a hardwired-type approach that considers the number of
negligible edges can be used to compute the softwired parsimony score.
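The count of negligible edges, and the quantity obtained by subtracting it from the changing number, can be sketched as below. The function names, the example network, and the example extension are assumptions for illustration (the network is hypothetical), and the minimization is a brute-force search suitable only for tiny inputs.

```python
from itertools import product


def negligible_edges(edges, ext):
    """Count reticulation edges (u, v) with ext[u] != ext[v] such that
    some parent p of v satisfies ext[p] == ext[v]."""
    parents = {}
    for u, v in edges:
        parents.setdefault(v, []).append(u)
    count = 0
    for u, v in edges:
        ps = parents[v]
        if len(ps) >= 2 and ext[u] != ext[v] and any(ext[p] == ext[v] for p in ps):
            count += 1
    return count


def adjusted_changes(edges, ext):
    """Changing number minus the number of negligible edges."""
    ch = sum(1 for u, v in edges if ext[u] != ext[v])
    return ch - negligible_edges(edges, ext)


def min_adjusted_changes(edges, character):
    """Minimize adjusted_changes over all extensions (brute force)."""
    vertices = {x for e in edges for x in e}
    internal = sorted(vertices - set(character))
    states = sorted(set(character.values()))
    return min(
        adjusted_changes(edges, {**character, **dict(zip(internal, choice))})
        for choice in product(states, repeat=len(internal))
    )


# Hypothetical network with one reticulation "h", and one extension of CHI1.
NETWORK = [("r", "a"), ("r", "b"), ("a", "1"), ("a", "c"), ("c", "2"),
           ("c", "h"), ("b", "h"), ("b", "4"), ("h", "3")]
CHI1 = {"1": "alpha", "2": "alpha", "3": "beta", "4": "beta"}
EXT1 = dict(CHI1, r="alpha", a="alpha", c="alpha", h="beta", b="beta")
```

In `EXT1`, the reticulation edge `("c", "h")` changes state while the other parent `"b"` agrees with `"h"`, so it is the single negligible edge; the changing number is two and the adjusted value is one.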
Theorem 5. Let $S = (\chi_1, \chi_2, \ldots, \chi_k)$ be a sequence of characters on $X$, and
let $\mathcal{N}$ be a rooted phylogenetic network on $X$. Then

$$PS_{soft}(S, \mathcal{N}) = \sum_{i=1}^{k} \min_{\bar{\chi}_i} \big(\mathrm{ch}(\bar{\chi}_i, \mathcal{N}) - n(\bar{\chi}_i, \mathcal{N})\big),$$

where, for each character $\chi_i$, the minimum is taken over all extensions $\bar{\chi}_i$ of $\chi_i$
to $V(\mathcal{N})$.

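Proof. To establish the theorem, it suffices to show that the result holds
when $S$ consists of a single character $\chi$, that is

$$PS_{soft}(\chi, \mathcal{N}) = \min_{\bar{\chi}} \big(\mathrm{ch}(\bar{\chi}, \mathcal{N}) - n(\bar{\chi}, \mathcal{N})\big).$$

Throughout the proof, we make use of Theorem 4 and consider the set of
all switchings of $\mathcal{N}$ to compute $PS_{soft}(\chi, \mathcal{N})$. Let $\bar{\chi}_1$ be an extension of $\chi$
to $V(\mathcal{N})$ such that $\mathrm{ch}(\bar{\chi}_1, \mathcal{N}) - n(\bar{\chi}_1, \mathcal{N}) = \min_{\bar{\chi}} \big(\mathrm{ch}(\bar{\chi}, \mathcal{N}) - n(\bar{\chi}, \mathcal{N})\big)$.
Furthermore, let $T_1$ be a switching of $\mathcal{N}$ such that for each reticulation $v$ of $\mathcal{N}$
that has a parent $p_v$ with $\bar{\chi}_1(v) = \bar{\chi}_1(p_v)$, the edge $(p_v, v)$ is an edge of $T_1$.
Since $V(T_1) = V(\mathcal{N})$, it follows that, by taking the identity function from
$V(\mathcal{N}) \to V(T_1)$, we can view $\bar{\chi}_1$ as an extension of $\chi$ to $V(T_1)$. Hence,

$$\min_{\bar{\chi}} \big(\mathrm{ch}(\bar{\chi}, \mathcal{N}) - n(\bar{\chi}, \mathcal{N})\big) = \mathrm{ch}(\bar{\chi}_1, \mathcal{N}) - n(\bar{\chi}_1, \mathcal{N}) \geq \mathrm{ch}(\bar{\chi}_1, T_1) \geq PS_{soft}(\chi, \mathcal{N}), \qquad (1)$$

where the second inequality follows from Theorem 4.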

Now, by Theorem there is a switching T 2 of Af and an extension X 2 of 
X to V(T 2 ) such that ch(x 2 ,T 2 ) = PSsoftiXiff)- We next show that X 2 can 
always be chosen so that, for each edge (u, v) in T 2 that is a reticulation edge 
in Af, we have X 2 {u) = X 2 {v). 

Let {u, v) be an edge in T 2 that is a reticulation edge in N and whose two 
endpoints are assigned to two different character states, i.e. X 2 (w) 7 ^ X 2 ('i^)- 
Furthermore, let xs be the extension of x to V{T 2 ) such that X 3 ('^) = X 2 {u), 
and x.^{w) = X 2 {w) for each vertex w of T 2 other than v. Recalling that, by 
the dehnition of a rooted phylogenetic network, v has exactly one child, it now 
follows that the contribution of the edges incident with v in T 2 to ch(x 2 ;F 2 ) 
is at least the contribution of those edges to ch(x 3 , T 2 ). In particular, by the 
optimality of X 2 , we have ch(x 2 ,T 2 ) = ch(x 3 ,T 2 )- Setting X 2 to be xs and 
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repeating this argument for each edge in T 2 that is a reticulation edge in M 
and whose two endpoints are assigned to two different character states, it 
follows that we eventually obtain an extension X 2 of y to (T 2 ) that realizes 
PSsoftiXy^) and has the property that, for each edge {u,v) of T 2 that is 
a reticulation edge in M, we have X 2 {u) = X 2 {'v). Furthermore, as T 2 is a 
spanning tree of Af, the extension X 2 is also an extension of y to Id (A/"). It 
is now easily checked that each reticulation edge in Af that is not an edge in 
T 2 either has two endpoints that are assigned to the same character state or 
is a negligible edge under X 2 , and so 

PSsoft{x,-f^) = ch{x2,T2) = ch{x2,Af)-n{x2,Af) 

> min(ch(y,A/)-n(y,A/)). (2) 

X 

Combining the two inequalities Q and (|^ establishes the theorem. □ 

Intuitively, in the second equivalence, we do not 'penalize' reticulation edges
$(u,v)$ directed into a reticulation $v$ whose endpoints are assigned to
different character states provided there is at least one reticulation edge
$(p,v)$ whose endpoints are assigned to the same state.
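The formula of Theorem 5 can be evaluated directly on small inputs, without
enumerating switchings. The sketch below is a brute-force illustration under
stated assumptions: the network is a list of directed edges, all extensions are
enumerated, and the four-leaf network with reticulation `v` and the two
characters are hypothetical examples rather than the paper's figures.

```python
from itertools import product

def softwired_score_via_negligible_edges(edges, leaf_states, states):
    """min over extensions of ch(ext, N) - n(ext, N) for a single character."""
    parents = {}
    for u, v in edges:
        parents.setdefault(v, []).append(u)
    vertices = {x for e in edges for x in e}
    inner = sorted(vertices - set(leaf_states))
    best = None
    for assignment in product(states, repeat=len(inner)):
        ext = {**leaf_states, **dict(zip(inner, assignment))}
        # ch: edges of the network whose endpoints are in different states.
        changes = sum(ext[u] != ext[v] for u, v in edges)
        # n: reticulation edges with a change whose head has some parent
        # in the same state as the head.
        negligible = sum(
            1 for u, v in edges
            if len(parents[v]) > 1
            and ext[u] != ext[v]
            and any(ext[p] == ext[v] for p in parents[v])
        )
        score = changes - negligible
        best = score if best is None else min(best, score)
    return best

# Hypothetical network on leaves 1..4; v is the only reticulation (parents x, w).
edges = [("root", "x"), ("root", "y"), ("x", "1"), ("x", "v"),
         ("y", "w"), ("y", "4"), ("w", "3"), ("w", "v"), ("v", "2")]
chi = {"1": 0, "2": 0, "3": 1, "4": 1}
chi2 = {"1": 0, "2": 1, "3": 1, "4": 0}
score_chi = softwired_score_via_negligible_edges(edges, chi, (0, 1))
score_chi2 = softwired_score_via_negligible_edges(edges, chi2, (0, 1))
```

On this example both characters obtain the score 1, which is also the minimum
number of changes over the two trees displayed by the hypothetical network.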


4. Connections between Maximum Parsimony and Maximum Likelihood on phylogenetic networks

When considering MP on networks, a natural question is which properties
of MP on trees still hold in the more general setting of phylogenetic networks.
One well-known property of MP on trees is its equivalence with maximum
likelihood (ML) (Tuffley and Steel, 1997) under the symmetric $r$-state model
with 'no common mechanism'. We will briefly introduce this model and
the equivalence result here before we analyze its parallels to phylogenetic
networks.

Before we state the results, we introduce some definitions. Recall that
the symmetric $r$-state model, which is also often called the $N_r$-model (and
the Jukes-Cantor model for $r = 4$ character states (Jukes and Cantor, 1969)),
is defined as follows. Let $\mathcal{N}$ be a rooted phylogenetic network, and
let $\{c_1, c_2, \ldots, c_r\}$ be $r$ distinct character states with
$r \geq 2$. The $N_r$-model assumes a uniform distribution of states at the root
of $\mathcal{N}$ and equal rates of substitutions between any two distinct
character states (Neyman, 1971). Under the $N_r$-model, we denote by $p(e)$ the
probability that a substitution of a character state $c_i$ by another character
state $c_j$ occurs on some edge $e \in E(\mathcal{N})$ for $c_i \neq c_j$.
Furthermore, let $q(e) = 1 - (r-1)p(e)$ denote the probability that no
substitution occurs on edge $e$. Then, in the $N_r$-model, we have
$0 \leq p(e) \leq \frac{1}{r}$ for all $e \in E(\mathcal{N})$ and
$(r-1)p(e) + q(e) = 1$. Note that the $N_r$-model is time-reversible, i.e. it
does not matter where the root of a network is placed, and the rate of change
from state $c_i$ to $c_j$ is the same as that from $c_j$ to $c_i$. Lastly, we
assume that, if a sequence consists of at least two characters, then the
different characters have evolved under no common mechanism (Tuffley and
Steel, 1997). This means that the substitution probabilities on the edges of
the underlying network $\mathcal{N}$ may be different for each character in the
sequence without any correlation between them.

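We will now turn our attention to likelihood concepts. Let $\mathcal{T}$ be a
rooted phylogenetic tree and let $\chi$ be a character on $X$. Recall that the
probability $P(\chi \mid \mathcal{T}, P^{\mathcal{T}})$ of $\chi$, for a given
probability vector $P^{\mathcal{T}}$ for character-state transitions on the
edges of $\mathcal{T}$, is the probability that a root state evolves along
$\mathcal{T}$ to the joint assignment of leaf states induced by $\chi$.
Furthermore, we have
$P(\chi \mid \mathcal{T}, P^{\mathcal{T}}) = \sum_{\bar{\chi}} P(\bar{\chi} \mid \mathcal{T}, P^{\mathcal{T}})$,
i.e. the likelihood of $\chi$ on $\mathcal{T}$ can be calculated as the sum of
the likelihoods of all possible extensions $\bar{\chi}$ of $\chi$ to
$V(\mathcal{T})$ (Felsenstein, 1981). The ML of $\chi$ on $\mathcal{T}$, denoted
by $\max P(\chi \mid \mathcal{T})$, is the value of
$P(\chi \mid \mathcal{T}, P^{\mathcal{T}})$ maximized over all possible
assignments of substitution probabilities $P^{\mathcal{T}}$, i.e.
$\max P(\chi \mid \mathcal{T}) = \max_{P^{\mathcal{T}}} P(\chi \mid \mathcal{T}, P^{\mathcal{T}})$.
Moreover, the trees for which $\max P(\chi \mid \mathcal{T})$ is maximum are
called ML trees.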


We are now in a position to state the equivalence result of MP and ML 
for trees. 


Theorem 6 (Theorem 5 of Tuffley and Steel (1997)). Let $\mathcal{T}$ be a
phylogenetic tree on $X$, and let $S = (\chi_1, \chi_2, \ldots, \chi_k)$ be a
sequence of $r$-state characters on $X$. Then, under the $N_r$-model with no
common mechanism,

\[
\max P(S \mid \mathcal{T}) = r^{-PS(S,\mathcal{T}) - k}. \tag{3}
\]

Thus, ML and MP both choose the same tree(s).

Note that Theorem 6 not only implies that both methods choose the
same optimal sets of rooted phylogenetic trees, but also that both methods
induce the same ranking of trees. This means that whenever a phylogenetic
tree $\mathcal{T}$ on $X$ has a lower parsimony score than another such tree
$\mathcal{T}'$, then Equation (3) implies that the likelihood of $\mathcal{T}$
is higher than that of $\mathcal{T}'$, which means that $\mathcal{T}$ will be
both more parsimonious and more likely than $\mathcal{T}'$. We will later use
this fact to directly establish a similar equivalence result for the softwired
parsimony setting on phylogenetic networks.

4.1. Softwired Likelihood Functions

We now turn to likelihood on phylogenetic networks. Let $\mathcal{N}$ be a
rooted phylogenetic network on $X$, and let $\chi$ be an $r$-state character on
$X$. Let $P^{\mathcal{N}} = (p(e_j) : e_j \in E(\mathcal{N}))$ denote the vector
of the probabilities $p(e_j)$ of a character-state transition on the edges of
$\mathcal{N}$ under the $N_r$-model. Furthermore, let $\mathcal{T}$ be a rooted
phylogenetic tree on $X$ that is displayed by $\mathcal{N}$, and let
$\mathcal{E}$ be an embedding of $\mathcal{T}$ in $\mathcal{N}$. Note that
$\mathcal{E}$ is not necessarily unique. We define a substitution probabilities
vector $P^{\mathcal{T}_{\mathcal{E}}} = (p'(e'_i) : e'_i \in E(\mathcal{T}))$
assigned to the edges of $\mathcal{T}$ as follows. For each edge $e'_i$ in
$\mathcal{T}$ that corresponds to a unique edge $e_j$ in $\mathcal{N}$ (and
$\mathcal{E}$), we set $p'(e'_i) = p(e_j)$. Otherwise, $e'_i$ corresponds to a
path of edges in $\mathcal{N}$ (and $\mathcal{E}$). If $e'_i$ corresponds to
exactly two edges, say $e_j$ and $e_k$, in $\mathcal{N}$, we set

\[
p'(e'_i) = p(e_j) + p(e_k) - r \cdot p(e_j)\,p(e_k).
\]

This definition considers the amount of change on both edges $e_j$ and $e_k$,
which correspond to $e'_i$, as well as the $r$ possible situations where a
change on $e_k$ undoes a change on $e_j$ so that there is no change occurring on
$e'_i$. This last part is subtracted. Moreover, $p'(e'_i)$ is equal to 0
precisely when both values $p(e_j)$ and $p(e_k)$ are 0, else it is positive. If
$e'_i$ corresponds to a path of $l$ edges in $\mathcal{N}$ with $l > 2$, we
iteratively apply the above equation $l - 1$ times. We call
$P^{\mathcal{T}_{\mathcal{E}}}$ a restriction of $P^{\mathcal{N}}$ to
$\mathcal{T}$ under the $N_r$-model. Furthermore, we denote by
$P^{\mathcal{T}}$ a restriction of $P^{\mathcal{N}}$ to $\mathcal{T}$ for which
the probability of observing $\chi$ given $\mathcal{T}$ and $P^{\mathcal{T}}$ is
maximized over all embeddings of $\mathcal{T}$ in $\mathcal{N}$, i.e.

\[
P(\chi \mid \mathcal{T}, P^{\mathcal{T}}) = \max_{\mathcal{E}} P(\chi \mid \mathcal{T}, P^{\mathcal{T}_{\mathcal{E}}}).
\]
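The rule for combining the substitution probabilities of two consecutive
network edges into one tree edge, and its iterated use along longer paths, can
be sketched as follows. The function names `compose` and `restrict_to_path` are
illustrative choices, not notation from the paper, and exact fractions are used
only to keep the arithmetic transparent.

```python
from fractions import Fraction
from functools import reduce

def compose(p1, p2, r):
    """Substitution probability of a tree edge that corresponds to two
    consecutive network edges with probabilities p1 and p2 (N_r-model)."""
    return p1 + p2 - r * p1 * p2

def restrict_to_path(probs, r):
    """Apply compose iteratively along a non-empty path of l edges,
    i.e. l - 1 times."""
    return reduce(lambda acc, p: compose(acc, p, r), probs)

# Example with r = 4 states: 1/10 + 1/5 - 4 * (1/10) * (1/5) = 11/50.
two_edges = compose(Fraction(1, 10), Fraction(1, 5), 4)
three_edges = restrict_to_path([Fraction(1, 10), Fraction(1, 5), Fraction(1, 20)], 4)
```

Because $1 - r\,p'(e'_i) = (1 - r\,p(e_j))(1 - r\,p(e_k))$, the order in which
the edges of a path are combined does not affect the result, and a combined
value never exceeds $\frac{1}{r}$ when the inputs satisfy
$0 \leq p(e) \leq \frac{1}{r}$.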

We next show that, for a frequently-used likelihood function, the equivalence
of parsimony and likelihood on phylogenetic networks no longer holds.
Let $\mathcal{N}$ be a rooted phylogenetic network on $X$, let $\chi$ be an
$r$-state character on $X$, and suppose that $P^{\mathcal{N}}$ is a vector of
substitution probabilities on the edges of $\mathcal{N}$ under the $N_r$-model
with no common mechanism. Furthermore, let $\mathcal{T}$ be a rooted
phylogenetic tree on $X$ that is displayed by $\mathcal{N}$. We denote by
$P(\mathcal{T} \mid \mathcal{N}, \chi)$ the probability that $\mathcal{T}$ is
chosen amongst all trees that are displayed by $\mathcal{N}$. The
above-mentioned likelihood function is the following, which can be found, for
example, in Nakhleh (2011).

\[
P_{w\text{-}soft}(\chi \mid \mathcal{N}, P^{\mathcal{N}}) = \max_{\mathcal{T}} \bigl(P(\mathcal{T} \mid \mathcal{N}, \chi) \cdot P(\chi \mid \mathcal{T}, P^{\mathcal{T}})\bigr),
\]

where the maximum is taken over all rooted phylogenetic trees $\mathcal{T}$ on
$X$ that are displayed by $\mathcal{N}$.

We call this the weighted softwired likelihood, and the maximum of the
weighted softwired likelihood over all probability assignments $P^{\mathcal{N}}$
will be denoted by $\max P_{w\text{-}soft}(\chi \mid \mathcal{N})$.
Biologically, it makes sense to distinguish between trees which are likely to
be chosen and those which are not. However, softwired parsimony and weighted
softwired likelihood are not equivalent on phylogenetic networks. More
formally, we show, by means of a counterexample that consists of a single
character, that

\[
\max P_{w\text{-}soft}(\chi \mid \mathcal{N}) = r^{-PS_{soft}(\chi,\mathcal{N}) - 1}
\]

does not hold.

Figure 5: A rooted phylogenetic network $\mathcal{N}$ on $X = \{1, 2, 3, 4\}$
with one reticulation and the two phylogenetic trees $\mathcal{T}_1$ and
$\mathcal{T}_2$ it displays. The character $\chi$ that is associated with the
leaf labels in $\mathcal{N}$ is also depicted together with a most parsimonious
extension to the inner vertices. The dashed edges represent character-state
transitions.

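Consider the rooted phylogenetic network $\mathcal{N}$ and the 2-state
character $\chi$ on the leaves of $\mathcal{N}$ shown in Figure 5. In the same
figure, the two rooted phylogenetic trees $\mathcal{T}_1$ and $\mathcal{T}_2$ on
$X$ are precisely the trees displayed by $\mathcal{N}$. Here, we have
$PS(\chi,\mathcal{T}_1) = 1$ and $PS(\chi,\mathcal{T}_2) = 2$ and, hence,
$PS_{soft}(\chi,\mathcal{N}) = 1$. Moreover, assuming the $N_2$-model with no
common mechanism, by Theorem 6 we have

\[
\max P(\chi \mid \mathcal{T}_1) = r^{-PS(\chi,\mathcal{T}_1) - k} = 2^{-1-1} = \frac{1}{4} \tag{4}
\]

and

\[
\max P(\chi \mid \mathcal{T}_2) = r^{-PS(\chi,\mathcal{T}_2) - k} = 2^{-2-1} = \frac{1}{8}, \tag{5}
\]

where $k$ is the number of characters under consideration. For the maximum
of the weighted softwired likelihood, we have

\[
\max P_{w\text{-}soft}(\chi \mid \mathcal{N})
= \max_{\mathcal{T} \in \{\mathcal{T}_1, \mathcal{T}_2\}} \bigl(P(\mathcal{T} \mid \mathcal{N}, \chi) \cdot \max P(\chi \mid \mathcal{T})\bigr)
= \max \left\{ P(\mathcal{T}_1 \mid \mathcal{N}, \chi) \cdot \frac{1}{4},\; P(\mathcal{T}_2 \mid \mathcal{N}, \chi) \cdot \frac{1}{8} \right\},
\]

where the last equality follows from Equations (4) and (5).

Now, if we assume, for example, that $\mathcal{T}_2$ is chosen three times as
often as $\mathcal{T}_1$, i.e.
$P(\mathcal{T}_1 \mid \mathcal{N}, \chi) = \frac{1}{4}$ and
$P(\mathcal{T}_2 \mid \mathcal{N}, \chi) = \frac{3}{4}$, then we have

\[
\max P_{w\text{-}soft}(\chi \mid \mathcal{N}) = \max \left\{ \frac{1}{4} \cdot \frac{1}{4},\; \frac{3}{4} \cdot \frac{1}{8} \right\} = \frac{3}{32}.
\]

Not only is this weighted softwired ML value unequal to
$2^{-PS_{soft}(\chi,\mathcal{N}) - 1} = \frac{1}{4}$, it is also achieved by
tree $\mathcal{T}_2$, whereas $\mathcal{T}_1$ is strictly better than
$\mathcal{T}_2$ in the softwired parsimony sense. Consequently, under the
weighted definition of softwired likelihood, the equivalence between parsimony
and likelihood on networks fails.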
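The arithmetic of this counterexample can be reproduced with a few lines of
code. This is a minimal sketch: the parsimony scores 1 and 2 and the tree
probabilities 1/4 and 3/4 are the values assumed in the text, while the
function names and the dictionary layout are illustrative choices.

```python
from fractions import Fraction

def max_tree_likelihood(ps, r, k=1):
    """Theorem 6: max P(S | T) = r^(-PS(S, T) - k)."""
    return Fraction(1, r ** (ps + k))

def weighted_softwired_ml(tree_scores, tree_probs, r):
    """Return the displayed tree that maximizes P(T | N, chi) * max P(chi | T),
    together with that maximum value (single character, k = 1)."""
    weighted = {
        name: tree_probs[name] * max_tree_likelihood(ps, r)
        for name, ps in tree_scores.items()
    }
    best = max(weighted, key=weighted.get)
    return best, weighted[best]

scores = {"T1": 1, "T2": 2}                          # PS(chi, T1), PS(chi, T2)
probs = {"T1": Fraction(1, 4), "T2": Fraction(3, 4)}  # assumed P(T | N, chi)

best_tree, value = weighted_softwired_ml(scores, probs, r=2)
parsimony_based = max_tree_likelihood(min(scores.values()), r=2)
```

With these inputs `best_tree` is `"T2"` and `value` is 3/32, whereas
`parsimony_based`, the quantity $2^{-PS_{soft}(\chi,\mathcal{N})-1}$, is 1/4.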

Next, we consider a second softwired likelihood concept on phylogenetic
networks which was introduced in Barry and Hartigan (1987) and analyzed by
Steel and Penny (2000). We call this concept the softwired pseudo-likelihood.
An explanation of why we call it pseudo-likelihood is given later. Let $\chi$
be a character on $X$. Given a rooted phylogenetic network $\mathcal{N}$ on $X$
and a vector $P^{\mathcal{N}}$ of substitution probabilities on the edges of
$\mathcal{N}$, we define the softwired pseudo-likelihood
$P_{soft}(\chi \mid \mathcal{N}, P^{\mathcal{N}})$ of $\chi$ to be

\[
P_{soft}(\chi \mid \mathcal{N}, P^{\mathcal{N}})
= \max_{\mathcal{T}} P(\chi \mid \mathcal{T}, P^{\mathcal{T}})
= \max_{\mathcal{T}} \sum_{\bar{\chi}} P(\bar{\chi} \mid \mathcal{T}, P^{\mathcal{T}}),
\]

where the maximum is taken over all rooted phylogenetic trees $\mathcal{T}$ on
$X$ that are displayed by $\mathcal{N}$ and the summation is taken over all
extensions $\bar{\chi}$ of $\chi$ to $V(\mathcal{T})$.

Now, the softwired pseudo-ML of $\chi$ on $\mathcal{N}$ is defined as the
maximum value of $P(\chi \mid \mathcal{T}, P^{\mathcal{T}})$ of the most likely
rooted phylogenetic tree on $X$ which is displayed by $\mathcal{N}$. That is,
the softwired pseudo-ML is defined by

\[
\max P_{soft}(\chi \mid \mathcal{N})
= \max_{\mathcal{T}} \max_{P^{\mathcal{T}}} P(\chi \mid \mathcal{T}, P^{\mathcal{T}})
= \max_{\mathcal{T}} \max_{P^{\mathcal{T}}} \sum_{\bar{\chi}} P(\bar{\chi} \mid \mathcal{T}, P^{\mathcal{T}}),
\]

where, for a rooted phylogenetic tree $\mathcal{T}$ displayed by $\mathcal{N}$,
the inner maximum is taken over all vectors of substitution probabilities on
the edges of $\mathcal{T}$ under the $N_r$-model. A (not necessarily unique)
softwired pseudo-ML network of $\chi$ is a network for which the softwired
pseudo-ML is maximum, i.e.

\[
\operatorname{argmax}_{\mathcal{N}} \bigl[\max P_{soft}(\chi \mid \mathcal{N})\bigr].
\]

Note that, by definition of the $N_r$-model with no common mechanism, the
softwired pseudo-ML for a sequence of $r$-state characters
$S = (\chi_1, \chi_2, \ldots, \chi_k)$ on $X$ can be calculated as the product
of the pseudo-likelihoods of the individual characters due to independence.
Hence, we have

\[
\max P_{soft}(S \mid \mathcal{N}) = \prod_{i=1}^{k} \max P_{soft}(\chi_i \mid \mathcal{N}).
\]


It is worth noting that the above definition of a pseudo-likelihood on a
phylogenetic network $\mathcal{N}$ on $X$ does not incorporate a probability
distribution on the trees that are displayed by $\mathcal{N}$. This is the
reason why we refer to it as pseudo-likelihood. In fact, if one sums up the
softwired pseudo-likelihoods $P_{soft}(\chi \mid \mathcal{N}, P^{\mathcal{N}})$
over all possible characters $\chi$ on $X$, the sum might be larger than 1.
However, this pseudo-likelihood has been discussed before in a different
context (e.g. in Barry and Hartigan (1987); Steel and Penny (2000)), and it
turns out to be strongly related to the softwired parsimony notion for
phylogenetic networks. Specifically, using Theorem 6 and the fact that not only
the optimal trees are the same, but the entire ranking induced by parsimony and
likelihood is identical, we next show that softwired MP and softwired pseudo-ML
on networks are equivalent.


Theorem 7 (Equivalence of softwired MP and softwired pseudo-ML for
networks). Let $\mathcal{N}$ be a rooted phylogenetic network on $X$, and let
$S = (\chi_1, \chi_2, \ldots, \chi_k)$ be a sequence of $r$-state characters on
$X$. Then, under the $N_r$-model with no common mechanism,

\[
\max P_{soft}(S \mid \mathcal{N}) = r^{-PS_{soft}(S,\mathcal{N}) - k}.
\]

Thus, softwired MP and softwired pseudo-ML both choose the same network(s).

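Proof. We first consider the case $k = 1$, i.e. $S = (\chi_1)$. By Theorem 6
and recalling the definition of the softwired parsimony score on $\mathcal{N}$,
we have

\[
\begin{aligned}
\max P_{soft}(\chi_1 \mid \mathcal{N})
&= \max_{\mathcal{T}} \max_{P^{\mathcal{T}}} P(\chi_1 \mid \mathcal{T}, P^{\mathcal{T}}) \\
&= \max_{\mathcal{T}} \max P(\chi_1 \mid \mathcal{T}) \\
&= \max_{\mathcal{T}} r^{-PS(\chi_1,\mathcal{T}) - 1} \\
&= r^{-PS_{soft}(\chi_1,\mathcal{N}) - 1},
\end{aligned}
\]

where each maximum over $\mathcal{T}$ is taken over all rooted phylogenetic
trees on $X$ that are displayed by $\mathcal{N}$.

Now, for a sequence $S = (\chi_1, \chi_2, \ldots, \chi_k)$ of characters, we
have

\[
\max P_{soft}(S \mid \mathcal{N})
= \prod_{i=1}^{k} \max P_{soft}(\chi_i \mid \mathcal{N})
= r^{-\sum_{i=1}^{k} (PS_{soft}(\chi_i,\mathcal{N}) + 1)}
= r^{-PS_{soft}(S,\mathcal{N}) - k},
\]

where the first equality follows from the fact that the characters are
independent under the no common mechanism model, the second equality follows
from the case $k = 1$, and the third equality follows again from the definition
of the softwired parsimony score on $\mathcal{N}$. This completes the proof. □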
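As a worked illustration of Theorem 7, consider again the network
$\mathcal{N}$ and the single 2-state character $\chi$ of Figure 5, so that
$r = 2$ and $k = 1$. Using only the values
$\max P(\chi \mid \mathcal{T}_1) = \frac{1}{4}$ and
$\max P(\chi \mid \mathcal{T}_2) = \frac{1}{8}$ from Equations (4) and (5), one
obtains

\[
\max P_{soft}(\chi \mid \mathcal{N})
= \max \left\{ \max P(\chi \mid \mathcal{T}_1),\; \max P(\chi \mid \mathcal{T}_2) \right\}
= \max \left\{ \frac{1}{4},\; \frac{1}{8} \right\}
= \frac{1}{4}
= 2^{-PS_{soft}(\chi,\mathcal{N}) - 1}.
\]

In this example the softwired pseudo-ML is attained by $\mathcal{T}_1$, the
tree that also realizes the softwired parsimony score, whereas the weighted
softwired ML with the tree probabilities $\frac{1}{4}$ and $\frac{3}{4}$
assumed earlier equals $\frac{3}{32}$ and is attained by $\mathcal{T}_2$.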


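4.2. A Hardwired Likelihood Function

In this section, we analyze a hardwired notion of likelihood on networks.
Let $\mathcal{N}$ be a rooted phylogenetic network on $X$, and let $\bar{\chi}$
be an extension of an $r$-state character $\chi$ on $X$ to $V(\mathcal{N})$.
Furthermore, let $P^{\mathcal{N}} = (p(e) : e \in E(\mathcal{N}))$ be a
substitution probabilities vector assigned to edges of $\mathcal{N}$ under the
$N_r$-model. We set the likelihood of $\bar{\chi}$ on $\mathcal{N}$ given
$P^{\mathcal{N}}$ to be

\[
P(\bar{\chi} \mid \mathcal{N}, P^{\mathcal{N}})
= \frac{1}{r} \prod_{\substack{e = (u,v):\\ \bar{\chi}(u) \neq \bar{\chi}(v)}} p(e)
\prod_{\substack{e = (u,v):\\ \bar{\chi}(u) = \bar{\chi}(v)}} q(e),
\]

where the first product considers all edges in $E(\mathcal{N})$ whose two
endpoints are assigned to two distinct character states and the second product
considers all edges in $E(\mathcal{N})$ whose two endpoints are assigned to the
same character state. Then, the hardwired pseudo-likelihood of observing $\chi$
on $\mathcal{N}$ for a given $P^{\mathcal{N}}$ under the $N_r$-model is defined
as

\[
P_{hard}(\chi \mid \mathcal{N}, P^{\mathcal{N}}) = \sum_{\bar{\chi}} P(\bar{\chi} \mid \mathcal{N}, P^{\mathcal{N}}),
\]

where the summation is taken over all extensions $\bar{\chi}$ of $\chi$ to
$V(\mathcal{N})$.

Now, the hardwired pseudo-ML of $\chi$ on $\mathcal{N}$, denoted by
$\max P_{hard}(\chi \mid \mathcal{N})$, is the maximum of
$P_{hard}(\chi \mid \mathcal{N}, P^{\mathcal{N}})$ over all $P^{\mathcal{N}}$.
Hence,

\[
\max P_{hard}(\chi \mid \mathcal{N}) = \max_{P^{\mathcal{N}}} P_{hard}(\chi \mid \mathcal{N}, P^{\mathcal{N}}).
\]

Finally, a (not necessarily unique) hardwired pseudo-ML network of $\chi$ is a
network for which the hardwired pseudo-ML is maximum, i.e.

\[
\operatorname{argmax}_{\mathcal{N}} \bigl[\max P_{hard}(\chi \mid \mathcal{N})\bigr].
\]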
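For a fixed vector of substitution probabilities, the hardwired
pseudo-likelihood can be evaluated by summing over all extensions. The sketch
below does this by brute force; it assumes the network is a list of directed
edges with a single root, that states are numbered $0, \ldots, r-1$, and that
`p` maps each edge to its substitution probability. The two-leaf tree is a
hypothetical example, and no maximization over $P^{\mathcal{N}}$ is attempted.

```python
from fractions import Fraction
from itertools import product

def hardwired_pseudo_likelihood(edges, leaf_states, p, r):
    """Sum over all extensions of (1/r) * prod p(e) over edges with a change
    * prod q(e) over edges without a change."""
    vertices = {x for e in edges for x in e}
    inner = sorted(vertices - set(leaf_states))
    total = Fraction(0)
    for assignment in product(range(r), repeat=len(inner)):
        ext = {**leaf_states, **dict(zip(inner, assignment))}
        term = Fraction(1, r)  # uniform distribution on the root state
        for u, v in edges:
            pe = p[(u, v)]
            qe = 1 - (r - 1) * pe  # probability of no substitution on (u, v)
            term *= pe if ext[u] != ext[v] else qe
        total += term
    return total

# Hypothetical two-leaf tree, i.e. a network without reticulations.
cherry = [("root", "a"), ("root", "b")]
p_cherry = {("root", "a"): Fraction(0), ("root", "b"): Fraction(1, 2)}
value = hardwired_pseudo_likelihood(cherry, {"a": 0, "b": 1}, p_cherry, r=2)
```

Here `value` equals 1/4, which coincides with $2^{-1-1}$, the maximum
likelihood given by Theorem 6 for a 2-state character with parsimony score 1
on a tree.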


As in the softwired case, the hardwired maximum pseudo-likelihood score for a
sequence of characters $S = (\chi_1, \chi_2, \ldots, \chi_k)$ on $X$ can be
calculated as the product of the pseudo-likelihoods of the individual
characters due to independence, i.e.

\[
\max P_{hard}(S \mid \mathcal{N}) = \prod_{i=1}^{k} \max P_{hard}(\chi_i \mid \mathcal{N}).
\]

As for parsimony, for a rooted phylogenetic tree $\mathcal{T}$ on $X$, we
remark that the softwired and the hardwired definitions of ML on networks are
equal and they also coincide with $\max P(S \mid \mathcal{T})$. Thus, we have

\[
\max P_{soft}(S \mid \mathcal{T}) = \max P_{hard}(S \mid \mathcal{T}) = \max P(S \mid \mathcal{T}).
\]

Moreover, we have the following equivalence result.

Theorem 8 (Equivalence of hardwired MP and hardwired pseudo-ML for
networks). Let $\mathcal{N}$ be a rooted phylogenetic network on $X$, and let
$S = (\chi_1, \chi_2, \ldots, \chi_k)$ be a sequence of $r$-state characters on
$X$. Then, under the $N_r$-model with no common mechanism,

\[
\max P_{hard}(S \mid \mathcal{N}) = r^{-PS_{hard}(S,\mathcal{N}) - k}.
\]

Thus, hardwired MP and hardwired pseudo-ML both choose the same network(s).

The proof of Theorem 8 is a rather technical generalization of the proof of
Theorem 5 in Tuffley and Steel (1997). In particular, the proof exploits the
fact that, if two leaves of a rooted phylogenetic network $\mathcal{N}$ are in
a different character state, then all paths in $\mathcal{N}$ that connect the
two leaves contain at least one edge whose two endpoints are assigned to two
different states. We omit the details of this proof.

We end this section with a remark. Recall that the likelihoods of all
characters on an arbitrary phylogenetic tree sum up to one. Since a phylogenetic
network $\mathcal{N}$ can be obtained from some phylogenetic tree $\mathcal{T}$
by adding edges, it follows that the likelihoods of all characters on
$\mathcal{N}$ may sum up to a value that is strictly less than one because each
likelihood on $\mathcal{T}$ will be multiplied with the substitution
probability on each additional edge in $\mathcal{N}$. This is the reason why we
refer to $P_{hard}(\chi \mid \mathcal{N}, P^{\mathcal{N}})$ as
pseudo-likelihood.


5. Conclusion 

The small parsimony problem on networks has recently attracted considerable
attention. In particular, several related complexity questions have been
settled (Fischer et al., 2015; Jin et al., 2009; Nguyen et al., 2007) and
exact algorithms and heuristics (Fischer et al., 2015; Kannan and Wheeler,
2012, 2014) to tackle this problem have been proposed. In contrast, the big
parsimony problem has so far only been mentioned in one article (Wheeler,
2015), where formal proofs were omitted. Yet, the big problem is exactly
what is ultimately of interest to evolutionary biologists who wish to
reconstruct a rooted phylogenetic network from molecular data under a parsimony
framework.

In this paper, we have presented the first formal analysis of MP networks
and uncovered several curious properties of such networks, including
interesting parallels to functions that resemble likelihood functions. Depending
on whether one reconstructs an MP network under the hardwired or softwired
framework, it is potentially either overly simple (under hardwired) or
overly complex (under softwired) in terms of the number of reticulations.
Consequently, under both notions, the biological relevance of MP networks
is challenged. In particular, the results in this paper show that neither
hardwired nor softwired MP can distinguish between evolutionary histories that
are best represented by a phylogenetic tree and histories that are best
represented by a network. This suggests that we need to reconsider the
definition of parsimony on networks and to develop a new or improved framework.

One such improvement that we propose is to consider an extension of
the softwired parsimony definition that computes the parsimony score of
a sequence $S$ of characters on a rooted phylogenetic network $\mathcal{N}$ by
first computing $PS_{soft}(S,\mathcal{N})$ and then increasing this score by a
certain user-defined 'penalty' for each reticulation in $\mathcal{N}$. Unless
the penalty is set to zero, an MP network for $S$ under this new definition is
unlikely to have a high number of reticulations and, similarly, unless the
penalty is set to infinity, such a network for $S$ is unlikely to be a tree.
Since evolutionary biologists often have valuable information at their
fingertips as to whether the expected amount of reticulation is significant or
not for a certain data set, this information can be used to compute
parsimonious networks that are biologically more meaningful than those
reconstructed under the hardwired or softwired definition. In particular, if a
high amount of reticulation is expected (e.g. as for certain groups of bacteria
or plants) the penalty should be smaller than in the case when one expects the
evolutionary history to be almost tree-like.
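The effect of the proposed penalty can be illustrated with a small sketch. The
additive form (score plus penalty times number of reticulations) follows the
description above, but the function names and the three candidate networks
with their scores are purely hypothetical and serve only to show how the
choice of penalty shifts the optimum between a tree and networks with more
reticulations.

```python
import math

def penalized_score(ps_soft, num_reticulations, penalty):
    """Softwired parsimony score plus a user-defined penalty per reticulation."""
    if num_reticulations == 0:
        return ps_soft  # a tree is never penalized, even for an infinite penalty
    return ps_soft + penalty * num_reticulations

def best_candidate(candidates, penalty):
    """Name of the candidate with the smallest penalized score."""
    return min(
        candidates,
        key=lambda name: penalized_score(*candidates[name], penalty),
    )

# Hypothetical candidates: name -> (PS_soft, number of reticulations).
candidates = {
    "tree": (5, 0),
    "one_reticulation": (3, 1),
    "three_reticulations": (2, 3),
}

choice_no_penalty = best_candidate(candidates, 0)
choice_moderate = best_candidate(candidates, 1.5)
choice_infinite = best_candidate(candidates, math.inf)
```

With a penalty of zero the candidate with three reticulations is selected,
with a penalty of 1.5 the candidate with one reticulation, and with an
infinite penalty the tree.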

Concerning the parallels of both the softwired and hardwired parsimony
concepts to likelihood concepts on phylogenetic networks, we showed in the
previous section that the equivalence fails as soon as a more meaningful
likelihood concept, which assigns probabilities to all trees that are displayed
by a network, is applied. However, it is easily seen that, if all such trees
have the same probability, the rankings suggested by softwired parsimony
and weighted softwired likelihood are identical and, thus, softwired MP and
weighted softwired ML choose the same optimal networks. On the other
hand, if one wanted to employ a (more biologically plausible) non-uniform
distribution on the trees displayed by a phylogenetic network, we conjecture
that softwired parsimony and weighted softwired likelihood are equivalent if
one changes the definition of softwired parsimony in a way that assigns a
suitable scaling factor to each displayed tree. For future research, it will be
interesting to prove this conjecture and to analyze how other definitions of
likelihood on networks relate to parsimony.